
PCS: Predictive Component-level Scheduling for Reducing Tail Latency in Cloud Online Services


Abstract

Modern latency-critical online services often rely on composing results from a large number of server components. Hence the tail latency (e.g. the 99th percentile of response time), rather than the average, of these components determines the overall service performance. When hosted on a cloud environment, the components of a service typically co-locate with short batch jobs to increase machine utilization, and share and contend for resources such as caches and I/O bandwidth with them. The highly dynamic nature of batch jobs, in terms of their workload types and input sizes, causes continuously changing performance interference to individual components, hence leading to their latency variability and high tail latency. However, existing techniques either ignore such fine-grained component latency variability when managing service performance, or rely on executing redundant requests to reduce the tail latency, which adversely deteriorates the service performance when load gets heavier. In this paper, we propose PCS, a predictive and component-level scheduling framework to reduce tail latency for large-scale, parallel online services. It uses an analytical performance model to simultaneously predict the component latency and the overall service performance on different nodes. Based on the predicted performance, the scheduler identifies straggling components and conducts near-optimal component-node allocations to adapt to the changing performance interference from batch jobs. We demonstrate that, using realistic workloads, the proposed scheduler reduces the component tail latency by an average of 67.05% and the average overall service latency by 64.16% compared with the state-of-the-art techniques on reducing tail latency.
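The abstract's core idea — predict per-component latency on each candidate node, then place the worst stragglers first so the service's tail is minimized — can be illustrated with a minimal greedy sketch. This is an assumption-laden illustration, not the paper's actual algorithm: the function `pcs_schedule`, the `predicted` latency table, and the per-node `capacity` constraint are all hypothetical stand-ins for the analytical performance model and allocation procedure described in the paper.

```python
def pcs_schedule(predicted, capacity):
    """Hypothetical greedy component-to-node allocation.

    predicted: {component: {node: predicted latency}} -- stand-in for the
               output of the paper's analytical performance model.
    capacity:  {node: max number of components} -- assumed constraint.
    Returns {component: node}.
    """
    load = {node: 0 for node in capacity}
    plan = {}
    # Place the worst stragglers first: components whose best achievable
    # predicted latency is highest dominate the tail, so they get first pick.
    order = sorted(predicted,
                   key=lambda c: min(predicted[c].values()),
                   reverse=True)
    for comp in order:
        # Among nodes with spare capacity, pick the one the model predicts
        # will run this component fastest.
        candidates = [n for n in predicted[comp] if load[n] < capacity[n]]
        best = min(candidates, key=lambda n: predicted[comp][n])
        plan[comp] = best
        load[best] += 1
    return plan
```

For example, with two components and two single-slot nodes, the component whose best-case latency is worse chooses first, and the other takes the remaining node — keeping the maximum (tail-dominating) predicted latency low.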
